{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true
   },
   "source": [
    "# Artificial Intelligence Nanodegree\n",
    "\n",
    "## Convolutional Neural Networks\n",
    "\n",
    "---\n",
    "\n",
    "In your upcoming project, you will download pre-computed bottleneck features.  In this notebook, we'll show you how to calculate VGG-16 bottleneck features on a toy dataset.  Note that unless you have a powerful GPU, computing the bottleneck features takes a significant amount of time.\n",
    "\n",
    "### 1. Load and Preprocess Sample Images\n",
    "\n",
    "Before supplying an image to a pre-trained network in Keras, there are some required preprocessing steps.  You will learn more about this in the project; for now, we have implemented this functionality for you in the first code cell of the notebook.  We have imported a very small dataset of 8 images and stored the  preprocessed image input as `img_input`.  Note that the dimensionality of this array is `(8, 224, 224, 3)`.  In this case, each of the 8 images is a 3D tensor, with shape `(224, 224, 3)`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Using TensorFlow backend.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(8, 224, 224, 3)\n"
     ]
    }
   ],
   "source": [
    "from keras.applications.vgg16 import preprocess_input\n",
    "from keras.preprocessing import image\n",
    "import numpy as np\n",
    "import glob\n",
    "\n",
    "img_paths = glob.glob(\"images/*.jpg\")\n",
    "\n",
    "def path_to_tensor(img_path):\n",
    "    # loads RGB image as PIL.Image.Image type\n",
    "    img = image.load_img(img_path, target_size=(224, 224))\n",
    "    # convert PIL.Image.Image type to 3D tensor with shape (224, 224, 3)\n",
    "    x = image.img_to_array(img)\n",
    "    # convert 3D tensor to 4D tensor with shape (1, 224, 224, 3) and return 4D tensor\n",
    "    return np.expand_dims(x, axis=0)\n",
    "\n",
    "def paths_to_tensor(img_paths):\n",
    "    list_of_tensors = [path_to_tensor(img_path) for img_path in img_paths]\n",
    "    return np.vstack(list_of_tensors)\n",
    "\n",
    "# calculate the image input. you will learn more about how this works the project!\n",
    "img_input = preprocess_input(paths_to_tensor(img_paths))\n",
    "\n",
    "print(img_input.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true
   },
   "source": [
    "### 2. Recap How to Import VGG-16\n",
    "\n",
    "Recall how we import the VGG-16 network (including the final classification layer) that has been pre-trained on ImageNet.\n",
    "\n",
    "![VGG-16 model](figures/vgg16.png)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "_________________________________________________________________\n",
      "Layer (type)                 Output Shape              Param #   \n",
      "=================================================================\n",
      "input_1 (InputLayer)         (None, 224, 224, 3)       0         \n",
      "_________________________________________________________________\n",
      "block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      \n",
      "_________________________________________________________________\n",
      "block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     \n",
      "_________________________________________________________________\n",
      "block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         \n",
      "_________________________________________________________________\n",
      "block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     \n",
      "_________________________________________________________________\n",
      "block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    \n",
      "_________________________________________________________________\n",
      "block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         \n",
      "_________________________________________________________________\n",
      "block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    \n",
      "_________________________________________________________________\n",
      "block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    \n",
      "_________________________________________________________________\n",
      "block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    \n",
      "_________________________________________________________________\n",
      "block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         \n",
      "_________________________________________________________________\n",
      "block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   \n",
      "_________________________________________________________________\n",
      "block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   \n",
      "_________________________________________________________________\n",
      "block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   \n",
      "_________________________________________________________________\n",
      "block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         \n",
      "_________________________________________________________________\n",
      "block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   \n",
      "_________________________________________________________________\n",
      "block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   \n",
      "_________________________________________________________________\n",
      "block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   \n",
      "_________________________________________________________________\n",
      "block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         \n",
      "_________________________________________________________________\n",
      "flatten (Flatten)            (None, 25088)             0         \n",
      "_________________________________________________________________\n",
      "fc1 (Dense)                  (None, 4096)              102764544 \n",
      "_________________________________________________________________\n",
      "fc2 (Dense)                  (None, 4096)              16781312  \n",
      "_________________________________________________________________\n",
      "predictions (Dense)          (None, 1000)              4097000   \n",
      "=================================================================\n",
      "Total params: 138,357,544.0\n",
      "Trainable params: 138,357,544.0\n",
      "Non-trainable params: 0.0\n",
      "_________________________________________________________________\n"
     ]
    }
   ],
   "source": [
    "from keras.applications.vgg16 import VGG16\n",
    "model = VGG16()\n",
    "model.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true
   },
   "source": [
    "For this network, `model.predict` returns a 1000-dimensional probability vector containing the predicted probability that an image returns each of the 1000 ImageNet categories.  The dimensionality of the obtained output from passing `img_input` through the model is `(8, 1000)`.  The first value of `8` merely denotes that 8 images were passed through the network."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(8, 1000)"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model.predict(img_input).shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true
   },
   "source": [
    "### 3. Import the VGG-16 Model, with the Final Fully-Connected Layers Removed\n",
    "\n",
    "When performing transfer learning, we need to remove the final layers of the network, as they are too specific to the ImageNet database.  This is accomplished in the code cell below.\n",
    "\n",
    "![VGG-16 model for transfer learning](figures/vgg16_transfer.png)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "_________________________________________________________________\n",
      "Layer (type)                 Output Shape              Param #   \n",
      "=================================================================\n",
      "input_2 (InputLayer)         (None, None, None, 3)     0         \n",
      "_________________________________________________________________\n",
      "block1_conv1 (Conv2D)        (None, None, None, 64)    1792      \n",
      "_________________________________________________________________\n",
      "block1_conv2 (Conv2D)        (None, None, None, 64)    36928     \n",
      "_________________________________________________________________\n",
      "block1_pool (MaxPooling2D)   (None, None, None, 64)    0         \n",
      "_________________________________________________________________\n",
      "block2_conv1 (Conv2D)        (None, None, None, 128)   73856     \n",
      "_________________________________________________________________\n",
      "block2_conv2 (Conv2D)        (None, None, None, 128)   147584    \n",
      "_________________________________________________________________\n",
      "block2_pool (MaxPooling2D)   (None, None, None, 128)   0         \n",
      "_________________________________________________________________\n",
      "block3_conv1 (Conv2D)        (None, None, None, 256)   295168    \n",
      "_________________________________________________________________\n",
      "block3_conv2 (Conv2D)        (None, None, None, 256)   590080    \n",
      "_________________________________________________________________\n",
      "block3_conv3 (Conv2D)        (None, None, None, 256)   590080    \n",
      "_________________________________________________________________\n",
      "block3_pool (MaxPooling2D)   (None, None, None, 256)   0         \n",
      "_________________________________________________________________\n",
      "block4_conv1 (Conv2D)        (None, None, None, 512)   1180160   \n",
      "_________________________________________________________________\n",
      "block4_conv2 (Conv2D)        (None, None, None, 512)   2359808   \n",
      "_________________________________________________________________\n",
      "block4_conv3 (Conv2D)        (None, None, None, 512)   2359808   \n",
      "_________________________________________________________________\n",
      "block4_pool (MaxPooling2D)   (None, None, None, 512)   0         \n",
      "_________________________________________________________________\n",
      "block5_conv1 (Conv2D)        (None, None, None, 512)   2359808   \n",
      "_________________________________________________________________\n",
      "block5_conv2 (Conv2D)        (None, None, None, 512)   2359808   \n",
      "_________________________________________________________________\n",
      "block5_conv3 (Conv2D)        (None, None, None, 512)   2359808   \n",
      "_________________________________________________________________\n",
      "block5_pool (MaxPooling2D)   (None, None, None, 512)   0         \n",
      "=================================================================\n",
      "Total params: 14,714,688.0\n",
      "Trainable params: 14,714,688.0\n",
      "Non-trainable params: 0.0\n",
      "_________________________________________________________________\n"
     ]
    }
   ],
   "source": [
    "from keras.applications.vgg16 import VGG16\n",
    "model = VGG16(include_top=False)\n",
    "model.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true
   },
   "source": [
    "### 4. Extract Output of Final Max Pooling Layer\n",
    "\n",
    "Now, the network stored in `model` is a truncated version of the VGG-16 network, where the final three fully-connected layers have been removed.  In this case, `model.predict` returns a 3D array (with dimensions $7\\times 7\\times 512$) corresponding to the final max pooling layer of VGG-16.  The dimensionality of the obtained output from passing `img_input` through the model is `(8, 7, 7, 512)`.  The first value of `8` merely denotes that 8 images were passed through the network.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(8, 7, 7, 512)\n"
     ]
    }
   ],
   "source": [
    "print(model.predict(img_input).shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true
   },
   "source": [
    "This is exactly how we calculate the bottleneck features for your project!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
